Prediction of drug resistance by Sanger sequencing of Mycobacterium tuberculosis complex strains isolated from multidrug resistant tuberculosis suspect patients in Ethiopia

Background Ethiopia is one of the high multidrug-resistant tuberculosis (MDR-TB) burden countries. Phenotypic drug susceptibility testing, however, can take several weeks due to the slow growth of Mycobacterium tuberculosis complex (MTBC) strains. In this study, we assessed the performance of a Sanger sequencing approach to predict resistance against five anti-tuberculosis drugs and described the pattern of resistance-mediating mutations. Methods We enrolled 226 MTBC culture-positive MDR-TB suspects and collected sputum specimens and socio-demographic and TB-related data from each suspect between June 2015 and December 2016 in Addis Ababa, Ethiopia. Phenotypic drug susceptibility testing (pDST) for rifampicin, isoniazid, pyrazinamide, ethambutol, and streptomycin using BACTEC MGIT 960 was compared with the results of a Sanger sequencing analysis of seven resistance-determining regions in the genes rpoB, katG, fabG-inhA, pncA, embB, rpsL, and rrs. Results DNA for Sanger sequencing was successfully extracted from 92.5% (209/226) of the MTBC-positive cultures, and the remaining 7.5% (17/226) of strains were excluded from the final analysis. Based on pDST results, drug resistance proportions were as follows: isoniazid: 109/209 (52.2%), streptomycin: 93/209 (44.5%), rifampicin: 88/209 (42.1%), ethambutol: 74/209 (35.4%), and pyrazinamide: 69/209 (33.0%). Resistance against isoniazid was mainly mediated by the mutation katG S315T (97/209, 46.4%) and resistance against rifampicin by rpoB S531L (58/209, 27.8%). The dominating resistance-conferring mutations for ethambutol, streptomycin, and pyrazinamide affected codon 306 in embB (48/209, 21.1%), codon 88 in rpsL (43/209, 20.6%), and codon 65 in pncA (19/209, 9.1%), respectively.
Agreement between phenotypic and genotypic DST was 89.9% (95% confidence interval [CI], 84.2%–95.8%) for isoniazid, 95.5% (95% CI, 91.2%–99.8%) for rifampicin, 98.6% (95% CI, 95.9%–100%) for ethambutol, 91.3% (95% CI, 84.6%–98.1%) for pyrazinamide, and 57.0% (95% CI, 46.9%–67.1%) for streptomycin. Conclusion We detected canonical mutations implicated in resistance to rifampicin, isoniazid, pyrazinamide, ethambutol, and streptomycin. High agreement with phenotypic DST results for all drugs renders Sanger sequencing promising to be performed as a complementary measure to routine phenotypic DST in Ethiopia. Sanger sequencing directly from sputum may accelerate accurate clinical decision-making in the future.



Background
Tuberculosis (TB) is still a major public health problem, with 10 million incident cases and 1.5 million deaths globally in 2019; 24% of the cases were reported from Africa [1]. Efforts to control TB have been confronted by the emergence and transmission of drug-resistant MTBC strains in many geographical areas (e.g., developing countries) [2]. Multidrug-resistant tuberculosis (MDR-TB) is one of the major global threats and is defined as resistance to at least rifampicin (RIF) and isoniazid (INH). According to the WHO report, 3.4% of new and 18% of previously treated cases had MDR-TB or RIF-resistant (RR)-TB worldwide, and 2.6% of new and 11% of previously treated cases were estimated to have MDR-TB/RR-TB in Africa [1].
Ethiopia is one of the countries with the highest TB, TB/HIV, and MDR TB burdens, with an estimated national TB incidence of 132 per 100,000 population and 108,714 notified new and relapse cases in 2019 [1,2]. According to WHO, the prevalence of MDR/RR TB was estimated at 0.71% in new cases and 12% in previously treated cases [1]. Despite this, studies conducted in the country revealed that the prevalence of MDR-TB ranged from 5% in the Northwestern part of the country to 46.3% in the central part (i.e., Jima and Addis Ababa) [3][4][5][6]. Moreover, our published report from this cohort population showed that the prevalence of MDR-TB among MDR-TB suspect patients in Addis Ababa, Ethiopia was 39.4%, with more than 58% of these patients being resistant to all first-line TB drugs [7].
Drug resistance in MTBC strains arises from mutations in functional genes [8]. These mutations often lead to changes of specific protein regions, e.g. drug binding sites, or occur in promoter regions of genes, resulting in increased transcription [8]. For instance, RIF resistance is associated with mutations found in an 81 bp "hot-spot" region of the gene rpoB, including codons 507 to 533 [9,10]. Mutations associated with INH resistance occur mainly in the gene katG that encodes for a catalase-peroxidase enzyme activating the drug or in the promoter region of the fabG1/inhA operon, which increases the transcription of the drug target protein (InhA) [10,11]. While mutations in the genes rpsL, rrs, and gidB can confer resistance to streptomycin (STR), resistance to ethambutol (EMB) is mediated by mutations found in embB [11][12][13]. Moreover, mutations in the gene pncA are associated with resistance to pyrazinamide (PZA) [11][12][13].
Accurate and rapid drug susceptibility testing (DST) is crucial for appropriate TB treatment [14]. However, the use of phenotypic DST (pDST) is confined to reference or central laboratories in many developing countries [15]. Molecular assays for genotypic DST (gDST), such as Cepheid GeneXpert and Hain MTBDRplus v2.0, on the other hand, interrogate only a few canonical mutations. Resistance-mediating mutations that are not interrogated by commercial molecular tests may lead to false-negative results, and particular combinations of mutations may lead to false-resistant interpretations [15]. Thus, it is important to identify which mutations are most prevalent in Ethiopia.
Therefore, the aim of this study was to characterize mutations associated with resistance against first-line anti-TB drugs in MTBC strains isolated from suspected MDR-TB patients in Addis Ababa, Ethiopia, and to compare the performance of DNA sequencing for the detection of resistance with that of the routine phenotypic DST method.

Study design and setting
A cross-sectional study was conducted from June 2015 to December 2016 in all health facilities that provide MDR-TB diagnosis services in Addis Ababa city, namely Addis Ababa Regional Referral Laboratory, Saint Peter Hospital, and Teklehaimnot Health Center. We enrolled 226 MDR-TB suspect cases who were culture positive and consented to participate in the study, including TB treatment failure cases, smear-positive cases who had known close contact with a confirmed MDR-TB patient, and new or retreatment cases who remained smear-positive for at least two or three months of treatment, respectively [16].
Besides sputum specimens, we collected socio-demographic, epidemiological, and clinical data from each study participant using a questionnaire. Mycobacterial culture and pDST were performed at the Ethiopian Public Health Institute, National Reference TB Laboratory, whereas Sanger sequencing was performed at the Research Center Borstel in Germany (Fig 1).

Specimen collection and laboratory analysis
Specimen collection. A minimum volume of 5 ml of sputum specimen produced by a deep cough was collected into a sterile wide mouth 50 ml falcon tube from each study participant. All specimens were stored at 2-8˚C at collection sites until transported to the National TB Reference Laboratory using a cold chain [7].
Microscopy examination. All collected samples were subjected to Ziehl-Neelsen (ZN) staining as described previously [7]. Briefly, a smear was prepared using a slide from the mucopurulent part of the sputum, air-dried, and stained. The stained slides were examined using a light microscope for the presence of Acid Fast Bacilli (AFB) [17].
Specimen decontamination and culture. To improve yield, both Löwenstein-Jensen (LJ) and Mycobacteria Growth Indicator Tube (MGIT) culture methods were used. All sputum samples were decontaminated with 4% sodium hydroxide-N-acetyl-L-cysteine (NaOH-NALC) and then neutralized with phosphate-buffered saline (PBS). The decontaminated samples were then inoculated into BACTEC™ MGIT 960 tubes (BD Diagnostics, Sparks, MD, USA) [18] and onto LJ slants [19], and both were incubated at 37˚C. The BACTEC™ MGIT 960 tubes were inspected daily for growth for a maximum of 42 days [18]. Similarly, the LJ slants were inspected weekly for eight weeks for colony growth and morphology [19].

Identification of mycobacteria.
Identification of the grown mycobacteria species was done by using MPT64 antigen detection methods (Capilia TB-neo Becton, Dickinson Diagnostic Systems; Sparks, MD, USA). Briefly, the test device consisted of a sample area, a test area containing the anti-MPB64 antibodies, and a control area where anti-species immunoglobulin antibodies are fixed. The testing method is based on immune-chromatographic principles, in which antibodies labeled with colloidal particles react with target antigens to form a migrating antigen-antibody complex, which is captured by a second fixed antibody. A color reaction takes place when the labeled particles are fixed. The result is interpreted as positive for the MTBC if the color reaction takes place in the test and control areas [20].
Phenotypic drug susceptibility testing. The DST for RIF, INH, EMB, STR, and PZA was performed using the BACTEC™ MGIT 960 method as described previously [7]. Briefly, 0.1 ml of a bacilli suspension adjusted to a McFarland standard was inoculated into a vial supplemented with reconstitution solution and containing 1.0 μg/ml of RIF, 0.1 μg/ml of INH, 5.0 μg/ml of EMB, 1.0 μg/ml of STR, or 100 μg/ml of PZA [18]. Mycobacterium tuberculosis strain H37Rv was used as a sensitive control for susceptibility testing. The result was interpreted when the growth unit value of the growth control reached 400 or more within 4 to 13 days. If the growth unit value of the tube containing the drug being tested was 100 or more, the strain was classified as resistant; if the growth unit value was less than 100, the strain was classified as susceptible.
Genomic DNA extraction. Genomic DNA was extracted from MTBC strains by a method described by Somerville et al. [21]. Briefly, a loopful of MTBC colonies was suspended in 400 μl of 10 mM Tris-HCl, 1 mM ethylene-diamine-tetra-acetic acid (EDTA) and heated for 20 minutes at 80˚C. Then 1 mg/ml of lysozyme was added and the suspension was incubated for 2 hours at 37˚C. This was followed by the addition of proteinase K (0.2 mg/ml) and 10% sodium dodecyl sulfate in distilled, deionized water (1.1%); the mixture was vortexed and incubated at 65˚C for 20 minutes. After incubation, a mixture of N-cetyl-N,N,N-trimethyl ammonium bromide (40 mM) and NaCl (0.1 M) was added, immediately followed by NaCl (0.6 M). The mixture was vortexed until it turned milky and incubated at 65˚C for 10 minutes. Then 750 μl of chloroform-isoamyl alcohol (24:1) was added, and the mixture was vortexed and centrifuged at 13,000 rpm in a microcentrifuge for 5 minutes at room temperature. The extracted DNA was then precipitated with 70% ethanol and re-suspended in 30 μl of TE buffer. Finally, DNA quality and concentration were determined spectrophotometrically from the absorbance at 260 nm and 280 nm.
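The growth-unit (GU) read-out rule given for pDST in this subsection can be written down compactly. The following is a minimal sketch under stated assumptions, not the instrument's software: the function name and the "not interpretable" label for tests that miss the control criterion are hypothetical, while the thresholds (control GU of 400 or more within 4 to 13 days, drug-tube GU of 100 or more for resistance) are taken from the text.

```python
def interpret_mgit_dst(control_gu: int, drug_gu: int, day: int) -> str:
    """Classify one drug tube from MGIT 960 growth units (illustrative only).

    control_gu: growth units of the drug-free growth control
    drug_gu:    growth units of the drug-containing tube
    day:        day on which the growth control was read
    """
    # The test is read only once the control reaches >= 400 GU within days 4-13.
    # The label returned otherwise is an assumption made for this sketch.
    if control_gu < 400 or not 4 <= day <= 13:
        return "not interpretable"
    # Drug tube GU >= 100 means resistant; below 100 means susceptible.
    return "resistant" if drug_gu >= 100 else "susceptible"


print(interpret_mgit_dst(control_gu=400, drug_gu=250, day=8))
print(interpret_mgit_dst(control_gu=400, drug_gu=12, day=8))
```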

Polymerase Chain Reaction and drug target gene sequencing
Polymerase Chain Reaction (PCR) amplification and sequencing of the RIF, INH, EMB, PZA, and STR drug targets in MTBC strains was done using the gene-specific primers listed in Table 1. Amplification was performed on an Eppendorf™ thermocycler under the following conditions: initial denaturation at 95˚C for 3 minutes; followed by 40 cycles of denaturation at 95˚C for 1 minute, annealing at 55˚C to 65˚C for 30 seconds or 1 minute (summarized in Table 1 for each gene), and extension at 72˚C for 30 seconds; and a final extension at 72˚C for 5 minutes. The PCR products were examined by electrophoresis on a 1.5% agarose gel using a 100 base pair DNA ladder.
Finally, ExoSAP cleanup of PCR products for sequencing was performed under the following conditions: 5 μl of PCR product was mixed with 1 μl of exonuclease and 1 μl of alkaline phosphatase, and the mix was placed in a thermal cycler with the hot lid off. The reaction was run for 30 min at 37˚C and 15 min at 80˚C [22], followed by a Sephadex cleanup of the sequence-PCR products. The resulting products were sequenced by single primer extension with their gene-specific forward and reverse primers to obtain optimal coverage of the target regions, using a BigDye Terminator kit and an ABI Prism 3500xL Genetic Analyzer (Applied Biosystems, USA).
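The PCR cycling conditions described in this section can be encoded as a simple program, which makes the per-gene differences (annealing temperature and time) explicit and gives a rough estimate of run time. This is an illustrative sketch only: the `Step` class and function names are hypothetical, the per-gene annealing settings come from Table 1 and are not reproduced here, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


def pcr_program(anneal_temp_c: float, anneal_seconds: int, cycles: int = 40) -> list:
    """Build the step list for one gene target from the conditions in the text."""
    if not 55 <= anneal_temp_c <= 65:
        raise ValueError("annealing temperature outside the 55-65 C range used")
    if anneal_seconds not in (30, 60):
        raise ValueError("annealing time was either 30 seconds or 1 minute")
    cycle = [
        Step("denaturation", 95, 60),
        Step("annealing", anneal_temp_c, anneal_seconds),
        Step("extension", 72, 30),
    ]
    return (
        [Step("initial denaturation", 95, 180)]
        + cycle * cycles
        + [Step("final extension", 72, 300)]
    )


def hold_minutes(program: list) -> float:
    """Total programmed hold time in minutes (excludes temperature ramping)."""
    return sum(step.seconds for step in program) / 60


# Example with a hypothetical 60 C / 30 s annealing setting.
print(hold_minutes(pcr_program(anneal_temp_c=60, anneal_seconds=30)))
```

With a 30-second annealing step the programmed hold time is 88 minutes; with a 1-minute annealing step it is 108 minutes.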

Data analysis
The sequencing data obtained from the ABI 3730XL DNA analyzer were imported into SeqScape® software version 2.7 (Applied Biosystems, Foster City, CA) and consensus sequences were generated. SeqScape® was used for DNA sequence comparisons, and mutations were detected in the respective genes by comparing them with the reference Mycobacterium tuberculosis strain H37Rv sequence. Likewise, all patient-related information, phenotypic drug profiles, and drug target gene mutation data were compiled, entered into an Excel sheet, cleaned, and categorized as necessary. Descriptive statistics, including frequencies and percentages, were computed for socio-demographic characteristics, TB exposure and treatment history, antibiotic treatment history, HIV status, alcohol consumption and smoking history, phenotypic drug profiles, and mutations identified in the drug target genes using SPSS version 23 statistical package software (SPSS Inc., Chicago, IL).
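The comparison of a consensus sequence with the H37Rv reference amounts to a codon-by-codon check. The sketch below is not the SeqScape algorithm; it is a simplified illustration that assumes the two sequences are already aligned, in frame, and free of insertions or deletions (so frameshift variants are not handled). The function name is hypothetical and the short fragments are toy inputs, not actual H37Rv sequence; only the codon changes AGC to ACC (katG S315T) and TCG to TTG (rpoB S531L) reflect well-known substitutions.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_changes(reference, sample, first_codon, include_silent=True):
    """List codon substitutions, e.g. 'S315T', between two aligned fragments.

    first_codon is the codon number of the first triplet in the fragment.
    """
    if len(reference) != len(sample) or len(reference) % 3:
        raise ValueError("fragments must be aligned, in frame, and equal length")
    calls = []
    for i in range(0, len(reference), 3):
        ref_codon = reference[i:i + 3].upper()
        alt_codon = sample[i:i + 3].upper()
        if ref_codon == alt_codon:
            continue
        ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
        if ref_aa == alt_aa and not include_silent:
            continue  # synonymous change, e.g. S531S
        calls.append(f"{ref_aa}{first_codon + i // 3}{alt_aa}")
    return calls


# Toy fragments spanning codons 314-316 (illustrative input only).
print(codon_changes("ACCAGCGGC", "ACCACCGGC", first_codon=314))
```

Reporting synonymous changes separately matters for interpretation, since a silent mutation such as S531S alters the sequence without changing the protein.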

Performance of Sanger sequencing for the prediction of drug resistance
Sensitivity, specificity, and overall agreement were calculated in comparison to the phenotypic DST results from the reference standard BACTEC MGIT960 (Becton Dickinson). Any identified mutation in the selected resistance determining regions (Table 1)

Ethical considerations
Scientific and ethical approval for the study was obtained from the Research and Ethical Review Committee of Addis Ababa University and the Ethiopian Public Health Institute. We obtained written and/or oral informed consent from study participants. Confidentiality of the participants' data and test results was maintained throughout the study period using codes.

Results
Overall, we enrolled 226 MDR-TB suspects and successfully isolated MTBC strains from all cultured samples (100%). However, we were able to extract DNA of sufficient quantity and quality for gDST from only 209 (92.5%) of these strains. Therefore, we excluded 17 strains (study participants) from the final analysis of this study.

Socio-demographic and clinical characteristics
Some socio-demographic and clinical characteristics data of the study participants used in this report were included in our previous report [7]. As shown in Table 2, the majority of MDR-TB suspects were males (59.3%, 124/209), married (59.3%, 124/209), and HIV positive (58.9%,

Phenotypic drug susceptibility tests
The pDST data used herein was included in our previously published report [7]. Table 3

Performance of Sanger sequencing
We further investigated the sensitivity and specificity of the prediction of individual drug resistances and the overall agreement (proportion of resistant and susceptible strains correctly classified) of Sanger sequencing DST for RIF, INH, PZA, EMB, and STR, compared to the phenotypic standard method BACTEC™ MGIT 960, as described in Table 5. The sensitivity and specificity for RIF were 95.5% (95% confidence interval [CI], 91.2% to 99.8%) and 95.9% (95% CI, 92.4% to 99.4%), respectively, resulting in a concordance of 95.7% (95% CI, 92.9% to 98.5%). Six discordant results were linked to the rpoB mutations R529P, L533P, L538P/V, and S531S (silent mutation) (Table 4).
Regarding the prediction of EMB resistance, 74 strains were classified as EMB resistant, with a sensitivity of 98.6% (95% CI, 95.9% to 100%) and a specificity of 99.3% (95% CI, 97.9% to 100%). The overall agreement for EMB resistance was 99.0% (95% CI, 97.7% to 100%). One discordant result was linked to the mutation embB D311G.
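The performance measures reported above follow directly from a 2×2 comparison of gDST against pDST. The sketch below shows the arithmetic under two assumptions that are not stated in the text: the cell counts are back-calculated from the reported percentages and the numbers of phenotypically resistant strains (88 RIF-resistant and 74 EMB-resistant of 209), and the confidence intervals are computed as normal-approximation (Wald) intervals truncated to the range 0 to 1, which reproduces the reported intervals to within rounding.

```python
from math import sqrt


def wald_ci(successes: int, total: int, z: float = 1.96):
    """Proportion with a normal-approximation 95% CI, truncated to [0, 1]."""
    p = successes / total
    half_width = z * sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def performance(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Sensitivity, specificity, and overall agreement against pDST."""
    return {
        "sensitivity": wald_ci(tp, tp + fn),
        "specificity": wald_ci(tn, tn + fp),
        "agreement": wald_ci(tp + tn, tp + fn + fp + tn),
    }


# Counts inferred from the reported percentages (assumption, not source data).
rif = performance(tp=84, fn=4, fp=5, tn=116)
emb = performance(tp=73, fn=1, fp=1, tn=134)

for drug, result in (("RIF", rif), ("EMB", emb)):
    for measure, (p, low, high) in result.items():
        print(f"{drug} {measure}: {p:.1%} (95% CI, {low:.1%} to {high:.1%})")
```

The Wald interval is simple but behaves poorly for proportions close to 100%, which is why the upper bounds need truncation here; an exact or Wilson interval would be a reasonable alternative.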

Discussion
We employed Sanger sequencing of MTBC strains from MDR-TB suspects in Ethiopia to investigate the genomic mutations implicated in resistance against RIF, INH, PZA, EMB, and STR. Overall, Sanger sequencing showed high accuracy that ranged from 78.9% for detection of STR resistance to 99.0% for detection of EMB resistance when compared to the phenotypic standard method, the BACTEC MGIT 960 system. This implies that Sanger sequencing has the potential to predict first-line drug resistance among MDR-TB suspects and can be used as a complementary approach to pDST to detect low-level drug resistance in resource-limited countries.
Drug-resistant TB is a major public health threat globally. It is an alarming obstacle to TB care, treatment, and prevention, especially in resource-limited countries [28]. Moreover, it often leads to poor outcomes for TB patients [28]. In this study, the majority of the phenotypic resistance against RIF could be explained by mutations in the rpoB target region. The mutation rpoB S531L was dominant and detected in 67% of the strains. This is similar to findings from South Africa [28]. In Sudan, a neighboring country to Ethiopia, resistance to RIF was reported to be mediated by the rpoB Ser450Leu, His445Tyr, His445Asn, and His445Asp mutations [29]; in the M. tuberculosis codon numbering used there, Ser450Leu corresponds to S531L in the E. coli numbering used here. Moreover, we found mutations in the rpoB gene in six phenotypically RIF-susceptible strains, including four within the RRDR, at R529P (n = 1), L533P (n = 2), and S531S (n = 1; silent mutation), and two outside the RRDR at L538P/V. This could be explained by low-level RIF resistance [30]. The mutations detected within the RRDR at S531S and L533P have been associated with false resistance when using probe-based gDST methods (e.g., GeneXpert MTB/RIF), which may lead to the administration of unnecessary treatment (i.e., overtreatment) [30].
The most prevalent mutation conferring resistance against INH was katG S315T, which was found in 97% of the MDR-TB strains. The katG S315T mutation has also been found to be dominant elsewhere in the world, in countries such as Sudan, South Africa, and Vietnam [29,31,32]. It is associated with a low fitness cost but with clinically significant levels of INH resistance [32,33]. Moreover, strains harboring the katG S315T mutation produce active catalase-peroxidase, tend to be in molecular clusters (i.e., transmissible from patient to patient), and are virulent in TB mouse models [33].
Furthermore, four of the five strains with INH resistance-conferring mutations in the promoter region of the fabG1-inhA operon had an additional mutation at katG S315T. The co-occurrence of the katG S315T and the fabG1-inhA promoter mutations would lead to a further increase in the INH resistance level and render ethionamide or prothionamide treatment unsuccessful. Another possibility could be a compensatory effect of the fabG1-inhA promoter mutations in catalase-deficient and INH-resistant strains. The co-selection of the fabG1-inhA promoter mutations has also been observed in other studies [34][35][36].
Encouragingly, our Sanger sequencing approach, based on the presence of mutations in the interrogated regions of the embB and pncA genes, resulted in high overall sensitivities (> 90%) and specificities (> 95%) for the prediction of EMB and PZA resistance. Resistance to both drugs is usually difficult to predict with genotypic tests due to breakpoint artefacts in EMB-resistant strains [37] and the diversity of pncA mutations in combination with challenging PZA test conditions [38][39][40][41].
Moreover, identical pncA mutations in MDR-TB strains from epidemiologically related patients might point towards ongoing transmission [42]. In this study, nearly 28% of the strains harbored the mutation pncA 64fs, while other patient strains showed very diverse and mostly unique pncA mutations. PZA is one of the essential drugs for the treatment of TB, including MDR-TB [43]. However, there is currently no reliable and rapid diagnostic method for the detection of PZA-resistant TB, and the pDST method depends on acidic pH and has a long turnaround time [43]. Thus, it is important to design or explore a reliable and rapid method. Interestingly, our findings showed that more than 97% of the genetic variants identified in the pncA gene were correlated with phenotypic resistance. Hence, Sanger sequencing could be a reliable and accurate method for the rapid diagnosis of PZA-resistant TB [44,45]. Regarding STR, the sensitivity of predicting resistance against this drug was most likely reduced by the presence of mutations in gidB, which was not interrogated in this study [46,47].

Conclusion
Overall, our study revealed that Sanger sequencing results can be used with high accuracy as a surrogate marker for pDST against the first-line drugs INH, RIF, EMB, and PZA in MDR-TB suspects. We showed that the sensitivity and specificity of this method are within the WHO recommendation for molecular assays. Moreover, Sanger sequencing is able to detect mutations that mediate only a low or moderate resistance increase. It detected many known canonical resistance-associated mutations implicated in resistance against RIF, INH, and EMB, as well as diverse mutations in the pncA gene associated with resistance against PZA. Further studies evaluating the performance of Sanger sequencing to predict drug resistance profiles directly from patient specimens, e.g., sputum and other body fluids, are desirable. The ability to detect rare mutations (not covered by commercial molecular tests), especially pncA mutations and low-level resistance mutations, such as those in rpoB, renders Sanger sequencing a promising tool to complement routine pDST in MDR-TB suspects.

Supporting information
S1 Table. All socio-demographic and phenotypic and genotypic drug susceptibility test results data. (XLSX)